Planning in POMDPs Using Multiplicity Automata
Abstract
Planning and learning in Partially Observable MDPs (POMDPs) are among the most challenging tasks in both the AI and Operations Research communities. Although these problems are intractable in general, there may be special cases, such as structured POMDPs, that can be solved efficiently. A natural and possibly efficient way to represent a POMDP is the predictive state representation (PSR), a representation that has recently received increasing attention. In this work, we relate POMDPs to multiplicity automata, showing that a POMDP can be represented by a multiplicity automaton with no increase in the representation size. Furthermore, we show that the size of the multiplicity automaton is equal to the rank of the predictive state representation. We thereby relate both the predictive state representation and POMDPs to the well-founded multiplicity automata literature. Based on the multiplicity automata representation, we provide a planning algorithm that is exponential only in the rank of the multiplicity automaton rather than in the number of states of the POMDP. As a result, whenever the size of the predictive state representation is logarithmic in the size of the standard POMDP representation, our planning algorithm is efficient.
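To make the "no increase in representation size" claim concrete, the following is a minimal sketch, not the paper's construction, of the standard way a POMDP can be read as a weighted (multiplicity) automaton: one matrix per action-observation pair, with the probability of an observation sequence given by a product of those matrices. The two-state model, its numbers, and the names `T`, `O`, `b0`, `operator` and `sequence_probability` are all hypothetical and chosen only for illustration.

```python
# Hypothetical 2-state POMDP with one action "a" and observations "x", "y".
# T[a][s][s2] = P(next state s2 | state s, action a)   (assumed numbers)
T = {"a": [[0.7, 0.3],
           [0.2, 0.8]]}
# O[a][s2][o] = P(observation o | next state s2, action a)   (assumed numbers)
O = {"a": [{"x": 0.9, "y": 0.1},
           {"x": 0.4, "y": 0.6}]}
# Initial belief over the two states (assumed).
b0 = [0.5, 0.5]


def operator(a, o):
    """Automaton matrix for the pair (a, o): M[s][s2] = T[a][s][s2] * O[a][s2][o]."""
    n = len(b0)
    return [[T[a][s][s2] * O[a][s2][o] for s2 in range(n)] for s in range(n)]


def sequence_probability(pairs):
    """P(o1..ok | a1..ak) computed as b0^T * M_{a1 o1} * ... * M_{ak ok} * 1."""
    vec = list(b0)
    for a, o in pairs:
        M = operator(a, o)
        n = len(vec)
        # Row vector times matrix: propagate the unnormalized belief.
        vec = [sum(vec[s] * M[s][s2] for s in range(n)) for s2 in range(n)]
    # Multiplying by the all-ones vector sums the remaining mass.
    return sum(vec)


if __name__ == "__main__":
    print(sequence_probability([("a", "x")]))
    print(sequence_probability([("a", "x"), ("a", "y")]))
```

In this sketch the automaton has exactly as many states as the POMDP, which is the sense in which the representation does not grow. The rank result stated in the abstract concerns how much smaller an equivalent automaton can be, which this example does not attempt to compute.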
Similar resources
Robot Path Planning Using Cellular Automata and Genetic Algorithm
In path planning problems, a complete description of the robot geometry, the environment, and the obstacles is given; the main goal is routing, that is, moving from source to destination without colliding with obstacles. The resulting route should also be optimal. Optimality in routing means minimizing the route, in other words, finding the best possible route to the destination. In most...
Links between multiplicity automata, observable operator models and predictive state representations: a unified learning framework
Stochastic multiplicity automata (SMA) are weighted finite automata that generalize probabilistic automata. They have been used in the context of probabilistic grammatical inference. Observable operator models (OOMs) are a generalization of hidden Markov models, which in turn are models for discrete-valued stochastic processes and are used ubiquitously in the context of speech recognition and b...
Approximate Planning for Factored POMDPs using Belief State Simplification
We are interested in the problem of planning for factored POMDPs. Building on the recent results of Kearns, Mansour and Ng, we provide a planning algorithm for factored POMDPs that exploits the accuracy-efficiency tradeoff in the belief state simplification introduced by Boyen and Koller.
Improved Planning for Infinite-Horizon Interactive POMDPs using Probabilistic Inference (Extended Abstract)
We provide the first formalization of self-interested multiagent planning using expectation-maximization (EM). Our formalization in the context of infinite-horizon and finitely-nested interactive POMDPs (I-POMDPs) is distinct from EM formulations for POMDPs and other multiagent planning frameworks. Specific to I-POMDPs, we exploit the graphical model structure and present a new approach based on b...
Planning in Stochastic Domains: Problem Characteristics and Approximations (Version II)
This paper is about planning in stochastic domains by means of partially observable Markov decision processes (POMDPs). POMDPs are difficult to solve and approximation is a must in real-world applications. Approximation methods can be classified into those that solve a POMDP directly and those that approximate a POMDP model by a simpler model. Only one previous method falls into the second categor...